Author

Lim Li Ying

Programming Interactive Data Visualization with R

Getting Started

Installing and loading R packages

For this exercise, besides tidyverse, four R packages will be used, namely ggiraph, plotly, DT and patchwork.

  • ggiraph: for making ggplot2 graphics interactive.
  • plotly: for plotting interactive statistical charts.
  • DT: for creating data tables.

The code chunk below uses p_load() of the pacman package to check if all the packages are installed on the computer. If they are, they will then be loaded into the R environment.

Click to show/hide the code
pacman::p_load(ggiraph, plotly, patchwork, DT, tidyverse)

Importing the data

To import the data “Exam_data.csv” into the R environment, read_csv() of the readr package is used, as seen in the code chunk below.

Click to show/hide the code
exam_data <- read_csv("data/Exam_data.csv")

Interactive data visualisation using ggiraph

Tooltip effect with tooltip aesthetic

The code chunk below is used to plot an interactive statistical graph using the tooltip feature of the ggiraph package.

Click to show/hide the code
# first plot the ggplot object using `geom_dotplot_interactive`
p <- ggplot(exam_data, aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = ID),
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL, breaks = NULL)

# use `girafe()` of ggiraph to generate an interactive svg object
girafe(
  ggobj = p,
  width_svg = 6,
  height_svg = 6*0.618
)
Note

Hover over each data point to see the student’s ID

Displaying multiple information on tooltip

Click to show/hide the code
# customize the tooltip
exam_data$tooltip <- c(paste0(
  "Name = ", exam_data$ID,
  "\n Class = ", exam_data$CLASS))

# plot the dotplot
p <- ggplot(exam_data, aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = exam_data$tooltip),
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL, breaks = NULL)

# use `girafe()` of ggiraph to generate an interactive svg object
girafe(
  ggobj = p,
  width_svg = 6,
  height_svg = 6*0.618
)
Note

Hover over each data point to see the student’s ID and class.

Customising tooltip style

The code chunk below uses opts_tooltip() to customise the tooltip

Click to show/hide the code
# use CSS to modify tooltip appearance
tooltip_css <- "background-color:white; #<<
font-style:bold; color:black;" #<< 

# plot the dotplot
p <- ggplot(exam_data, aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(tooltip = ID),
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL, breaks = NULL)

# use `opts_tooltip` to change tooltip appearance
girafe(
  ggobj = p,
  width_svg = 6,
  height_svg = 6*0.618,
  options = list( #<<
    opts_tooltip( #<<
      css = tooltip_css)) #<<
)
Note

Data points now have a white background and black font (black background and white font is the default).

Displaying statistics on tooltip

The code chunk below shows a more advanced way of customizing the tooltip.

Click to show/hide the code
# define a function to compute 90% coonfidence interval of the mean
tooltip <- function(y, ymax, accuracy = 0.01) {
  mean <- scales::number(y, accuracy = accuracy)
  sem <- scales::number(ymax - y, accuracy = accuracy)
  paste("Mean maths scores:", mean, "+/-", sem)
}

gg_point <- ggplot(exam_data, aes(x = RACE)) +
  stat_summary(aes(y = MATHS, tooltip = after_stat(tooltip(y,ymax))),
               fun.data = "mean_se",
               geom = GeomInteractiveCol,
               fill = "lightblue") +
  stat_summary(aes(y = MATHS),
               fun.data = mean_se,
               geom = "errorbar", width = 0.2, linewidth = 0.2)

girafe(ggobj = gg_point,
       width_svg = 8,
       height_svg = 8 * 0.618)
Note

Hover over the bars to see the mean scores.

Hover effect with data_id aesthetic

Another interactive feature of the ggiraph package is the data_id feature. The code chunk below shows an example of it.

Click to show/hide the code
p <- ggplot(exam_data, aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(data_id = CLASS),
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL, breaks = NULL)

girafe(
  ggobj = p,
  width_svg = 6,
  height_svg = 6 * 0.618
)
Note

Hovering over a data point will highly all data points in the same class.

Modifying the hover effect

The code chunk below uses css codes to modify the highlighting effect.

Click to show/hide the code
p <- ggplot(exam_data, aes(x = MATHS)) +
  geom_dotplot_interactive(
    aes(data_id = CLASS),
    stackgroups = TRUE,
    binwidth = 1,
    method = "histodot") +
  scale_y_continuous(NULL, breaks = NULL)
      
girafe(
  ggobj = p,
  width_svg = 6,
  height_svg = 6 * 0.618,
  options = list(
    opts_hover(css = 'fill: #202020;'),
    opts_hover_inv(css = "opacity: 0.2;")
  )
)

Click effect with onclick aesthetic

The onclick argument provides hotlink interactivity on the web.

Click to show/hide the code
exam_data$onclick <- sprintf("window.open(\"%s%s\")",
"https://www.moe.gov.sg/schoolfinder?journey=Primary%20school",
as.character(exam_data$ID))

p <- ggplot(exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(onclick = onclick),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +               
  scale_y_continuous(NULL,               
                     breaks = NULL)
girafe(                                  
  ggobj = p,                             
  width_svg = 6,                         
  height_svg = 6*0.618)         
Note

Clicking on the plot will open up the MOE website.

Coordinated multiple views with ggiraph

Coordinated multiple views can be created by combining interactive functions of ggiraph with the patchwork function. There are two steps involved:

  1. Use appropriate interactive functions of the ggiraph package to create the multiple views.

  2. Use the patchwork function of the patchwork package inside the girafe() function to create the interactive coordinated views.

Click to show/hide the code
p1 <- ggplot(data=exam_data, 
       aes(x = MATHS)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") +  
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

p2 <- ggplot(data=exam_data, 
       aes(x = ENGLISH)) +
  geom_dotplot_interactive(              
    aes(data_id = ID),              
    stackgroups = TRUE,                  
    binwidth = 1,                        
    method = "histodot") + 
  coord_cartesian(xlim=c(0,100)) + 
  scale_y_continuous(NULL,               
                     breaks = NULL)

girafe(code = print(p1 + p2), 
       width_svg = 6,
       height_svg = 3,
       options = list(
         opts_hover(css = "fill: #202020;"),
         opts_hover_inv(css = "opacity:0.2;")
         )
       ) 

Interactive data visualisation using plotly

There are two ways to create interactive graphs using plotly, namely plot_ly() and ggplotly().

Creating an interactive scatter plot using plot_ly()

The code chunk below shows an example of creating a basic interactive scatter plot using plot_ly().

Click to show/hide the code
plot_ly(exam_data, 
        x = ~MATHS,
        y = ~ENGLISH)

Working with visual variables in plot_ly()

In the code chunk below, the color argument is mapped to a qualitative visual variable (in this example, the visual variable is the RACE).

Click to show/hide the code
plot_ly(exam_data, 
        x = ~MATHS,
        y = ~ENGLISH,
        color = ~RACE)

Creating an interactive scatter plot using ggplotly()

The code chunk below shows an example of creating a basic interactive scatter plot using ggplotly().

Click to show/hide the code
p <- ggplot(exam_data, 
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))
ggplotly(p)
Note

Note that only an additional line ggplotly() is included.

Coordinated mulitple views with plotly

There are three steps involved in the creation of coordinated multiple views with plotly:

  1. highlight_key of the plotly package is used as shared data.

  2. Use ggplot2 functions to create 2 scatter plots.

  3. subplot() function of the plotly package is used to place the plots side-by-side.

Click to show/hide the code
d <- highlight_key(exam_data)
p1 <- ggplot(data=d, 
            aes(x = MATHS,
                y = ENGLISH)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

p2 <- ggplot(data=d, 
            aes(x = MATHS,
                y = SCIENCE)) +
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))
subplot(ggplotly(p1),
        ggplotly(p2))

Interactive data visualisation using crosstalk methods

Creating an interactive data table using DT package

The code chunk below is an example of creating of data tables.

Click to show/hide the code
DT::datatable(exam_data, class = "compact")

Linked brushing using crosstalk method

Click to show/hide the code
d <- highlight_key(exam_data) 
p <- ggplot(d, 
            aes(ENGLISH, 
                MATHS)) + 
  geom_point(size=1) +
  coord_cartesian(xlim=c(0,100),
                  ylim=c(0,100))

gg <- highlight(ggplotly(p),        
                "plotly_selected")  

crosstalk::bscols(gg,               
                  DT::datatable(d), 
                  widths = 5)  
Note

Selecting one or multiple rows from the data table will highlight the selection on the scatter plot

Programming Animated Statistical Graphics with R

Getting Started

Installing and loading R packages

For this exercise, in addition to plotly and tidyverse, the following packages are used:

  • readxl: for importing excel worksheets.

  • gifski: for creating animated GIFs.

  • gapminder: for this exercise, the purpose is to use the country_colors scheme.

  • gganimate: for creating animated statistical graphs.

Click to show/hide the code
pacman::p_load(readxl, gifski, gapminder,
               plotly, gganimate, tidyverse)

Importing the data

Click to show/hide the code
col <- c("Country", "Continent")
globalPop <- read_xls("data/GlobalPopulation.xls",
                      sheet="Data") %>%
  mutate_each_(funs(factor(.)), col) %>%
  mutate(Year = as.integer(Year))
Note
  • mutate_each_() of dplyr package is used to convert all character data type into factor.

  • mutate() is used to convert Year into integer.

Animated data visualization using gganimate methods

Building a static bubble plot

The code chunk below uses basic ggplot functions to build a static bubble plot.

Click to show/hide the code
ggplot(globalPop, aes(x = Old, y = Young, 
                      size = Population, 
                      colour = Country)) +
  geom_point(alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(title = 'Year: {frame_time}', 
       x = '% Aged', 
       y = '% Young')

Building an animated bubble plot

The code chunk below uses transition_time() of gganimate to create transition through distinct states in time (i.e. Years), and ease_aes() to control easing of aesthetics, with the default being linear.

Click to show/hide the code
ggplot(globalPop, aes(x = Old, y = Young, 
                      size = Population, 
                      colour = Country)) +
  geom_point(alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(title = 'Year: {frame_time}', 
       x = '% Aged', 
       y = '% Young') +
  transition_time(Year) +       
  ease_aes('linear')    

Animated data visualization using plotly methods

Building an animated bubble plot using ggplotly()

The code chunk below demonstrated how to build an animate bubble plot using ggplotly() instead.

Click to show/hide the code
gg <- ggplot(globalPop, 
       aes(x = Old, 
           y = Young, 
           size = Population, 
           colour = Country)) +
  geom_point(aes(size = Population,
                 frame = Year),
             alpha = 0.7, 
             show.legend = FALSE) +
  scale_colour_manual(values = country_colors) +
  scale_size(range = c(2, 12)) +
  labs(x = '% Aged', 
       y = '% Young')

ggplotly(gg)
Note
  • A static bubble plot is first created using the appropriate ggplot2 functions and then saved as an R object called gg.

  • ggplotly() is then used to convert the R object into an animated svg object.

Building an animated bubble plot using plot_ly()

Click to show/hide the code
bp <- globalPop %>%
  plot_ly(x = ~Old, 
          y = ~Young, 
          size = ~Population, 
          color = ~Continent, 
          frame = ~Year, 
          text = ~Country, 
          hoverinfo = "text",
          type = 'scatter',
          mode = 'markers'
          )
bp